Chinese Named Entity Recognition with Conditional Probabilistic Models
Abstract
This paper describes the work on Chinese named entity recognition performed by the Yahoo team at the third International Chinese Language Processing Bakeoff. We used two kinds of conditional probabilistic models for this task: conditional random fields (CRFs) and maximum entropy models. In particular, we trained two conditional random field recognizers and one maximum entropy recognizer to identify names of people, places, and organizations in unsegmented Chinese texts. Our best performance is an F-score of 86.2% on the MSRA dataset and 88.53% on the CITYU dataset.
Similar resources
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three approaches have conventionally been used to extract named entities from text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on Persian if they are trained with good features. To get good performanc...
A Framework Based on Graphical Models with Logic for Chinese Named Entity Recognition
Chinese named entity recognition (NER) has recently been viewed as a classification or sequence labeling problem, and many approaches have been proposed. However, they tend to address this problem without considering linguistic information in Chinese NEs. We propose a new framework based on probabilistic graphical models with first-order logic for Chinese NER. First, we use Conditional Random Fi...
Two Step Chinese Named Entity Recognition Based on Conditional Random Fields Models
This paper mainly describes a Chinese named entity recognition (NER) system, NER@ISCAS, which integrates text, part-of-speech, and small-vocabulary-character-lists features with heuristic post-processing rules for the MSRA NER open track under the framework of the Conditional Random Fields (CRFs) model.
Robust Algorithms for Semantic Class Labeling in Chinese Query Understanding
In this paper we propose an approach to handling word variation introduced by automatic speech recognition (ASR) and by keyboard input errors in human-computer interaction systems. Considering the characteristics of Chinese, fuzzy matching based on Chinese pinyin is used to correct the semantic concepts in a natural language query. The approach has two stages: first, conditional random field (CR...
Tree Representations in Probabilistic Models for Extended Named Entities Detection
In this paper we deal with Named Entity Recognition (NER) on transcriptions of French broadcast data. Two aspects make the task more difficult than previous NER tasks: i) the named entities annotated in this work have a tree structure, so the task cannot be tackled as a sequence labelling task; ii) the data are noisier than the data used for previous NER tasks. We approach the...